
    Exploring compression techniques for ROOT IO

    ROOT provides a flexible format used throughout the HEP community. Its wide range of use cases, from an archival data format to end-stage analysis, has required a number of tradeoffs to be exposed to the user. For example, a high "compression level" in the traditional DEFLATE algorithm will result in a smaller file (saving disk space) at the cost of slower decompression (costing CPU time when read). At the scale of the LHC experiments, poor design choices can result in terabytes of wasted space or wasted CPU time. We explore and attempt to quantify some of these tradeoffs. Specifically, we explore: the use of alternative compression algorithms to optimize for read performance; an alternate method of compressing individual events to allow efficient random access; and a new approach to whole-file compression. Quantitative results are given, as well as guidance on how to make compression decisions for different use cases.
    Comment: Proceedings for the 22nd International Conference on Computing in High Energy and Nuclear Physics (CHEP 2016).
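
    To make the tradeoff concrete, here is a minimal, hypothetical micro-benchmark using Python's standard-library zlib and lzma modules as stand-ins for the DEFLATE levels and alternative algorithms discussed above; it is not ROOT's actual compression API, and the sample payload is invented.

```python
# Hypothetical sketch of the compression-level tradeoff: higher levels
# shrink the output but cost more CPU time. Uses stdlib zlib/lzma as
# stand-ins for ROOT's compression settings.
import lzma
import os
import time
import zlib

payload = os.urandom(1 << 16) + b"event-data " * 50_000  # mixed-entropy sample

def measure(name, compress, decompress):
    t0 = time.perf_counter()
    blob = compress(payload)
    t1 = time.perf_counter()
    decompress(blob)
    t2 = time.perf_counter()
    ratio = len(payload) / len(blob)
    print(f"{name:8s} ratio={ratio:5.2f} "
          f"compress={t1 - t0:6.3f}s decompress={t2 - t1:6.3f}s")

for level in (1, 6, 9):  # low, default, and high DEFLATE levels
    measure(f"zlib-{level}",
            lambda d, l=level: zlib.compress(d, l),
            zlib.decompress)
measure("lzma", lzma.compress, lzma.decompress)
```

    On typical inputs this shows exactly the space-versus-CPU tradeoff the paper quantifies: higher levels and heavier algorithms save disk at the cost of (de)compression time.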

    Continuous Performance Benchmarking Framework for ROOT

    Foundational software libraries such as ROOT are under intense pressure to avoid software regressions, including performance regressions. Continuous performance benchmarking, as a part of continuous integration and other code quality testing, is an industry best practice for understanding how the performance of a software product evolves over time. We present a framework, built from industry best practices and tools, to help understand ROOT code performance and to monitor the efficiency of the code across several processor architectures. It additionally tracks historical performance measurements for the ROOT I/O, vectorization, and parallelization sub-systems.
    Comment: 8 pages, 5 figures, CHEP 2018 - 23rd International Conference on Computing in High Energy and Nuclear Physics.
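
    The abstract does not spell out the framework's internals, but the regression check at the heart of any such system can be sketched as follows; this is a minimal sketch assuming past timings are available as a simple list, with the history store and CI wiring omitted.

```python
# Minimal sketch of a performance-regression check: flag a benchmark run
# whose timing is an outlier above the historical baseline. The real
# framework's storage, plotting, and CI integration are not shown.
import statistics

def is_regression(history, new_time, sigmas=3.0):
    """Return True if new_time exceeds the historical mean by > sigmas stdevs."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    return new_time > mean + sigmas * stdev

history = [1.02, 0.98, 1.01, 0.99, 1.00, 1.03]  # seconds, past commits
print(is_regression(history, 1.31))  # True: likely performance regression
print(is_regression(history, 1.01))  # False: within normal variation
```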

    Discovering Job Preemptions in the Open Science Grid

    The Open Science Grid (OSG) is a world-wide computing system which facilitates distributed computing for scientific research. It can distribute a computationally intensive job to geo-distributed clusters and process the job's tasks in parallel. For compute clusters on the OSG, physical resources may be shared between OSG jobs and the cluster's local user-submitted jobs, with local jobs preempting OSG-based ones. As a result, job preemptions occur frequently in OSG, sometimes significantly delaying job completion time. We have collected job data from OSG over a period of more than 80 days. We present an analysis of the data, characterizing the preemption patterns and different types of jobs. Based on these observations, we group OSG jobs into five categories and analyze the runtime statistics for each category. We further choose different statistical distributions to estimate the probability density function of job runtime for the different classes.
    Comment: 8 pages.
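
    As an illustration of the last step, here is a sketch of fitting one candidate distribution to a job category's runtimes by maximum likelihood; the exponential model and the synthetic data are assumptions for illustration, not necessarily the distributions chosen in the paper.

```python
# Illustrative fit of a runtime distribution for one job category via
# maximum likelihood. For an exponential, the MLE of the rate is simply
# 1 / sample mean; the paper may use different distributions per class.
import numpy as np

rng = np.random.default_rng(0)
runtimes = rng.exponential(scale=120.0, size=1000)  # synthetic runtimes, minutes

rate = 1.0 / runtimes.mean()  # exponential MLE

def pdf(t):
    """Estimated probability density of job runtime t (minutes)."""
    return rate * np.exp(-rate * t)

print(f"estimated mean runtime: {1.0 / rate:.1f} min")
print(f"density at t=60 min:    {pdf(60.0):.5f}")
```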

    Extending ROOT through Modules

    The ROOT software framework is foundational for the HEP ecosystem, providing capabilities such as I/O, a C++ interpreter, a GUI, and math libraries. It uses object-oriented concepts and build-time components to layer functionality. We believe additional layering formalisms will benefit ROOT and its users. We present a modularization strategy for ROOT which aims to formalize the description of existing source components, make dependencies and other metadata available outside the build system, and allow post-install additions of functionality in the runtime environment. Components can then be grouped into packages, installable from external repositories, to deliver missing packages as a post-install step. This provides a mechanism for the wider software ecosystem to interact with a minimalistic install. Reducing intra-component dependencies improves maintainability and code hygiene. We believe helping maintain the smallest possible "base install" will help embedding use cases. The modularization effort draws inspiration from the Java, Python, and Swift ecosystems. Keeping aligned with modern C++, this strategy relies on forthcoming features such as C++ modules. We hope formalizing the component layer will provide simpler ROOT installs, improve extensibility, and decrease the complexity of embedding ROOT in other ecosystems.
    Comment: 8 pages, 2 figures, 1 listing, CHEP 2018 - 23rd International Conference on Computing in High Energy and Nuclear Physics.
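
    A hypothetical sketch of the kind of externally available dependency metadata described above: a manifest mapping modules to their dependencies, resolved into a valid install order with a topological sort. The manifest format is invented here; only the module names echo real ROOT components.

```python
# Hypothetical module manifest exposed outside the build system, plus a
# topological sort giving an order in which packages can be installed so
# every dependency is present first.
from graphlib import TopologicalSorter

manifest = {
    "Core":   [],
    "RIO":    ["Core"],          # I/O layer depends on Core
    "Hist":   ["Core", "RIO"],   # histograms depend on Core and I/O
    "Tree":   ["Core", "RIO"],
    "RooFit": ["Hist", "Tree"],  # delivered as a post-install addition
}

order = list(TopologicalSorter(manifest).static_order())
print("install order:", order)
# e.g. ['Core', 'RIO', 'Hist', 'Tree', 'RooFit']
```

    With such metadata, a minimalistic "base install" could consist of Core alone, with everything else fetched on demand from external repositories.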

    Designing Computing System Architecture and Models for the HL-LHC era

    This paper describes a programme to study the computing model in CMS after the next long shutdown near the end of the decade.
    Comment: Submitted to proceedings of the 21st International Conference on Computing in High Energy and Nuclear Physics (CHEP2015), Okinawa, Japan.

    Data Access for LIGO on the OSG

    During 2015 and 2016, the Laser Interferometer Gravitational-Wave Observatory (LIGO) conducted a three-month observing campaign. These observations delivered the first direct detection of gravitational waves from binary black hole mergers. To search for these signals, the LIGO Scientific Collaboration uses the PyCBC search pipeline. To deliver science results in a timely manner, LIGO collaborated with the Open Science Grid (OSG) to distribute the required computation across a series of dedicated, opportunistic, and allocated resources. To deliver the petabytes necessary for such a large-scale computation, our team deployed a distributed data access infrastructure based on the XRootD server suite and the CernVM File System (CVMFS). This data access strategy grew from simply accessing remote storage to a POSIX-based interface underpinned by distributed, secure caches across the OSG.
    Comment: 6 pages, 3 figures, submitted to PEARC17.
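
    A toy read-through cache illustrating the general idea behind this strategy: serve data from a local cache when possible, and populate the cache from remote storage on a miss. The XRootD and CVMFS interfaces are far richer than this, and fetch_remote() plus the file name are made-up stand-ins.

```python
# Toy read-through cache: hit -> local POSIX read; miss -> fetch from
# remote storage and cache for subsequent readers. A sketch only; real
# XRootD/CVMFS caches handle security, consistency, and federation.
import pathlib

CACHE_DIR = pathlib.Path("/tmp/demo-cache")

def fetch_remote(name: str) -> bytes:
    """Stand-in for a remote XRootD read; here it just fabricates bytes."""
    return f"contents of {name}".encode()

def read(name: str) -> bytes:
    """Serve from the local cache, fetching and caching on a miss."""
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    cached = CACHE_DIR / name
    if cached.exists():           # cache hit: local POSIX read
        return cached.read_bytes()
    data = fetch_remote(name)     # cache miss: go to remote storage
    cached.write_bytes(data)      # populate cache for later readers
    return data

print(read("example-frame-file.hdf"))  # first call misses, then is cached
```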

    Long Term Dynamics for Two Three-Species Food Webs

    In this paper, we analyze two possible scenarios for food webs with two prey species and one predator (a food web is similar to a food chain except that in a web we have more than one species at some levels). In neither scenario do the prey compete; rather, the scenarios differ in the selection method used by the predator. We determine how the dynamics depend on various parameter values. For some parameter values, one or more species dies out. For other parameter values, all species co-exist at equilibrium. For still other parameter values, the populations behave cyclically. We have even discovered parameter values for which the system exhibits chaos and has a positive Lyapunov exponent. Our analysis relies on common techniques such as nullcline analysis, equilibrium analysis, and singular perturbation analysis.
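
    The abstract does not reproduce the paper's equations or parameter values, so the following is only a generic two-prey, one-predator, Lotka-Volterra-style sketch, integrated with fourth-order Runge-Kutta, to show the kind of system whose long-term dynamics such an analysis examines; all parameters are illustrative.

```python
# Generic two-prey, one-predator system in the spirit of the food webs
# above. The model and all parameter values are invented for illustration;
# the paper's actual equations may differ.
import numpy as np

def deriv(s, r1=1.0, r2=0.8, a1=0.5, a2=0.4, b1=0.3, b2=0.2, d=0.6):
    x1, x2, z = s  # prey 1, prey 2, predator densities
    return np.array([
        x1 * (r1 - a1 * z),           # prey grow, lose to predation
        x2 * (r2 - a2 * z),
        z * (b1 * x1 + b2 * x2 - d),  # predator gains by eating prey
    ])

# Fourth-order Runge-Kutta integration of the system.
s, dt = np.array([1.0, 1.0, 0.5]), 0.01
for _ in range(5000):
    k1 = deriv(s)
    k2 = deriv(s + 0.5 * dt * k1)
    k3 = deriv(s + 0.5 * dt * k2)
    k4 = deriv(s + dt * k3)
    s += dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

print("state after t=50:", s)  # extinction, equilibrium, or cycles,
                               # depending on the chosen parameters
```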

    SciTokens: Capability-Based Secure Access to Remote Scientific Data

    The management of security credentials (e.g., passwords, secret keys) for computational science workflows is a burden for scientists and information security officers. Problems with credentials (e.g., expiration, privilege mismatch) cause workflows to fail to fetch needed input data or store valuable scientific results, distracting scientists from their research by requiring them to diagnose the problems, re-run their computations, and wait longer for their results. In this paper, we introduce SciTokens, open source software to help scientists manage their security credentials more reliably and securely. We describe the SciTokens system architecture, design, and implementation addressing use cases from the Laser Interferometer Gravitational-Wave Observatory (LIGO) Scientific Collaboration and the Large Synoptic Survey Telescope (LSST) projects. We also present our integration with widely-used software that supports distributed scientific computing, including HTCondor, CVMFS, and XrootD. SciTokens uses IETF-standard OAuth tokens for capability-based secure access to remote scientific data. The access tokens convey the specific authorizations needed by the workflows, rather than general-purpose authentication impersonation credentials, to address the risks of scientific workflows running on distributed infrastructure including NSF resources (e.g., LIGO Data Grid, Open Science Grid, XSEDE) and public clouds (e.g., Amazon Web Services, Google Cloud, Microsoft Azure). By improving the interoperability and security of scientific workflows, SciTokens 1) enables use of distributed computing for scientific domains that require greater data protection and 2) enables use of more widely distributed computing resources by reducing the risk of credential abuse on remote systems.
    Comment: 8 pages, 6 figures, PEARC '18: Practice and Experience in Advanced Research Computing, July 22--26, 2018, Pittsburgh, PA, USA.
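
    A sketch of the capability idea using the widely used PyJWT library as a stand-in for the actual SciTokens client library: the token below carries a narrow read scope and a short lifetime rather than a general-purpose identity. The claim values, issuer URL, and signing key are hypothetical, and real deployments use asymmetric keys.

```python
# Capability-style token sketched with PyJWT (pip install pyjwt) as a
# stand-in; the actual SciTokens library and claim conventions differ.
# The token grants only a narrow read capability, not an identity.
import time

import jwt

SECRET = "demo-signing-key"  # hypothetical; real issuers use asymmetric keys

token = jwt.encode(
    {
        "scope": "read:/ligo/frames",         # capability, not identity
        "exp": int(time.time()) + 3600,       # short lifetime limits abuse
        "iss": "https://example.org/issuer",  # hypothetical token issuer
    },
    SECRET,
    algorithm="HS256",
)

# The resource server verifies the signature and expiry, then checks
# that the requested operation is covered by the token's scopes.
claims = jwt.decode(token, SECRET, algorithms=["HS256"])
assert "read:/ligo/frames" in claims["scope"].split()
print("authorized scopes:", claims["scope"])
```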